T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects
We introduce T-LESS, a new public dataset for estimating the 6D pose, i.e.
translation and rotation, of texture-less rigid objects. The dataset features
thirty industry-relevant objects with no significant texture and no
discriminative color or reflectance properties. The objects exhibit symmetries
and mutual similarities in shape and/or size. Compared to other datasets, a
unique property is that some of the objects are parts of others. The dataset
includes training and test images that were captured with three synchronized
sensors, specifically a structured-light and a time-of-flight RGB-D sensor and
a high-resolution RGB camera. There are approximately 39K training and 10K test
images from each sensor. Additionally, two types of 3D models are provided for
each object, i.e. a manually created CAD model and a semi-automatically
reconstructed one. Training images depict individual objects against a black
background. Test images originate from twenty test scenes having varying
complexity, which increases from simple scenes with several isolated objects to
very challenging ones with multiple instances of several objects and with a
high amount of clutter and occlusion. The images were captured from a
systematically sampled view sphere around the object/scene, and are annotated
with accurate ground truth 6D poses of all modeled objects. Initial evaluation
results indicate that the state of the art in 6D object pose estimation has
ample room for improvement, especially in difficult cases with significant
occlusion. The T-LESS dataset is available online at cmp.felk.cvut.cz/t-less.
Comment: WACV 201
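A systematically sampled view sphere of the kind described above can be built from a regular azimuth-elevation grid. The sketch below illustrates the idea only; the step counts, radius, and elevation range are placeholders, not the actual T-LESS capture parameters:

```python
import numpy as np

def sample_view_sphere(n_azimuth=36, n_elevation=5, radius=0.65,
                       elev_range=(10.0, 85.0)):
    """Sample camera positions on a regular azimuth-elevation grid.

    Returns an (n_azimuth * n_elevation, 3) array of XYZ positions on a
    sphere of the given radius, centered on the object/scene.
    """
    azimuths = np.deg2rad(np.linspace(0.0, 360.0, n_azimuth, endpoint=False))
    elevations = np.deg2rad(np.linspace(elev_range[0], elev_range[1],
                                        n_elevation))
    # Spherical-to-Cartesian conversion for every (elevation, azimuth) pair.
    cams = [[radius * np.cos(el) * np.cos(az),
             radius * np.cos(el) * np.sin(az),
             radius * np.sin(el)]
            for el in elevations for az in azimuths]
    return np.asarray(cams)
```

Each returned position lies exactly on the sphere, so a camera placed there and aimed at the origin views the scene from a systematically varied direction.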
CNOS: A Strong Baseline for CAD-based Novel Object Segmentation
We propose a simple three-stage approach to segment unseen objects in RGB
images using their CAD models. Leveraging recent powerful foundation models,
DINOv2 and Segment Anything, we create descriptors and generate proposals,
including binary masks for a given input RGB image. By matching proposals with
reference descriptors created from CAD models, we achieve precise object ID
assignment along with modal masks. We experimentally demonstrate that our
method achieves state-of-the-art results in CAD-based novel object
segmentation, surpassing existing approaches on the seven core datasets of the
BOP challenge by 19.8% AP using the same BOP evaluation protocol. Our source
code is available at https://github.com/nv-nguyen/cnos
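The matching stage of the three-stage approach amounts to a nearest-neighbor search in descriptor space: each proposal descriptor is compared against per-object reference descriptors and assigned the ID with the highest cosine similarity. A minimal sketch under that reading (array shapes and names are ours, not taken from the CNOS code):

```python
import numpy as np

def assign_object_ids(proposal_desc, reference_desc):
    """Match each proposal descriptor to its closest reference descriptor.

    proposal_desc:  (P, D) array, one descriptor per segmentation proposal.
    reference_desc: (K, D) array, one aggregated descriptor per object ID.
    Returns (ids, scores): best-matching object ID and its cosine
    similarity for every proposal.
    """
    # L2-normalize so the dot product equals cosine similarity.
    p = proposal_desc / np.linalg.norm(proposal_desc, axis=1, keepdims=True)
    r = reference_desc / np.linalg.norm(reference_desc, axis=1, keepdims=True)
    sim = p @ r.T                 # (P, K) cosine similarity matrix
    ids = sim.argmax(axis=1)      # best object ID per proposal
    scores = sim.max(axis=1)      # confidence of each assignment
    return ids, scores
```

The scores can then be thresholded to discard proposals that match no CAD model well.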
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
We present AssemblyHands, a large-scale benchmark dataset with accurate 3D
hand pose annotations, to facilitate the study of egocentric activities with
challenging hand-object interactions. The dataset includes synchronized
egocentric and exocentric images sampled from the recent Assembly101 dataset,
in which participants assemble and disassemble take-apart toys. To obtain
high-quality 3D hand pose annotations for the egocentric images, we develop an
efficient pipeline, where we use an initial set of manual annotations to train
a model to automatically annotate a much larger dataset. Our annotation model
uses multi-view feature fusion and an iterative refinement scheme, and achieves
an average keypoint error of 4.20 mm, which is 85% lower than the error of the
original annotations in Assembly101. AssemblyHands provides 3.0M annotated
images, including 490K egocentric images, making it the largest existing
benchmark dataset for egocentric 3D hand pose estimation. Using this data, we
develop a strong single-view baseline of 3D hand pose estimation from
egocentric images. Furthermore, we design a novel action classification task to
evaluate predicted 3D hand poses. Our study shows that having higher-quality
hand poses directly improves the ability to recognize actions.
Comment: CVPR 2023. Project page: https://assemblyhands.github.io
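The 4.20 mm figure above is an average per-keypoint Euclidean error. A minimal sketch of how such a metric is computed (the 21-keypoint hand layout and array shapes are our assumptions, not the paper's exact evaluation code):

```python
import numpy as np

def mean_keypoint_error_mm(pred, gt):
    """Average Euclidean distance between predicted and ground-truth
    3D hand keypoints.

    pred, gt: (N, 21, 3) arrays of keypoint coordinates in millimetres
    (N frames, 21 keypoints per hand, XYZ).
    """
    # Per-keypoint Euclidean distance, then mean over keypoints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())
```

Lower values mean the annotations (or predictions) are closer to the ground truth, which is the sense in which the annotation model's error is 85% lower than the original Assembly101 annotations.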
In-Hand 3D Object Scanning from an RGB Sequence
We propose a method for in-hand 3D scanning of an unknown object with a
monocular camera. Our method relies on a neural implicit surface representation
that captures both the geometry and the appearance of the object. However, in
contrast with most NeRF-based methods, we do not assume that the camera-object
relative poses are known. Instead, we simultaneously optimize both the object
shape and the pose trajectory. As direct optimization over all shape and pose
parameters is prone to fail without coarse-level initialization, we propose an
incremental approach that starts by splitting the sequence into carefully
selected overlapping segments within which the optimization is likely to
succeed. We reconstruct the object shape and track its poses independently
within each segment, then merge all the segments before performing a global
optimization. We show that our method reconstructs the shape and color of both
textured and challenging texture-less objects, that it outperforms classical
methods relying only on appearance features, and that its performance is close
to recent methods that assume known camera poses.
Comment: CVPR 202
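The incremental strategy of splitting the sequence into overlapping segments can be illustrated with a simple uniform splitter. This is a sketch of the idea only: the paper selects segment boundaries carefully so that per-segment optimization is likely to succeed, whereas the lengths and overlap below are placeholders:

```python
def split_into_segments(n_frames, seg_len=30, overlap=10):
    """Split frame indices [0, n_frames) into overlapping segments.

    Returns a list of (start, end) index pairs where consecutive segments
    share `overlap` frames, so per-segment reconstructions can later be
    aligned and merged before a global optimization.
    """
    step = seg_len - overlap
    segments = []
    start = 0
    while start < n_frames:
        end = min(start + seg_len, n_frames)
        segments.append((start, end))
        if end == n_frames:
            break
        start += step
    return segments
```

The shared frames between neighboring segments are what make the independently tracked pose trajectories mergeable into one consistent trajectory.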
BlenderProc: Reducing the Reality Gap with Photorealistic Rendering
BlenderProc is an open-source and modular pipeline for rendering photorealistic images of procedurally generated 3D scenes which can be used for training data-hungry deep learning models. The presented results on the tasks of instance segmentation and surface normal estimation suggest that our photorealistic training images reduce the gap between the synthetic training and real test domains, compared to less realistic training images combined with domain randomization. BlenderProc can be used to train models for various computer vision tasks such as semantic segmentation or estimation of depth, optical flow, and object pose. By offering standard modules for parameterizing and sampling materials, objects, cameras and lights, BlenderProc can simulate various real-world scenarios and provide the means to systematically investigate the essential factors for sim2real transfer.
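The kind of scene-parameter sampling such a pipeline standardizes can be illustrated with a plain-Python sketch. To be clear, the parameter names and ranges below are invented for illustration and are not BlenderProc's actual API:

```python
import random

def sample_scene_params(rng):
    """Sample one random scene configuration (hypothetical parameters).

    rng: a random.Random instance, so scene generation is reproducible
    from a seed. Every field below is an illustrative stand-in for the
    kind of material/light/camera parameters a rendering pipeline varies.
    """
    return {
        "light_energy": rng.uniform(100.0, 1000.0),     # lamp strength
        "light_position": [rng.uniform(-3.0, 3.0) for _ in range(3)],
        "material_roughness": rng.uniform(0.0, 1.0),    # PBR roughness
        "camera_distance": rng.uniform(0.5, 1.5),       # metres from object
    }
```

Drawing a fresh configuration per rendered image is what lets a synthetic dataset cover many real-world scenarios, and seeding the generator makes each scene reproducible for systematic sim2real studies.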
Inflamed In Vitro Retina: Cytotoxic Neuroinflammation and Galectin-3 Expression.
BACKGROUND: Disease progression in retinal neurodegeneration is strongly correlated with immune cell activation, which may have either a neuroprotective or a neurotoxic effect. Increased knowledge about the immune response profile in retinal neurodegeneration may lead to candidate targets for treatments. We have therefore used the explanted retina as a model to explore the immune response and the expression of the immune modulator galectin-3 (Gal-3), induced by the cultivation per se and after additional immune stimulation with lipopolysaccharide (LPS), and how this correlates with retinal neurotoxicity.
METHODS: Post-natal mouse retinas were cultured in a defined medium. One group was stimulated with LPS (100 ng/ml, 24 h). Retinal architecture, apoptotic cell death, and micro- and macroglial activity were studied at the start of cultivation (0 days in vitro (DIV)) and at 3, 4 and 7 DIV using morphological staining and biochemical and immunohistochemical techniques.
RESULTS: Our results show that sustained activation of macro- and microglia, characterized by no detectable cytokine release and limited Gal-3 expression, does not induce apoptosis beyond the axotomy-induced apoptosis in the innermost nuclear layer. An elevated immune response was detected after LPS stimulation, demonstrated primarily by the release of immune mediators (interleukin 2 (IL-2), IL-6, KC/GRO (also known as CXCL1) and tumour necrosis factor-α (TNF-α)), by increased numbers of microglia displaying late-activation morphologies, and by Gal-3 expression. This was accompanied by increased apoptosis in the two other nuclear layers and by damage to the gross retinal architecture.
CONCLUSION: We demonstrate that an immune response characterized by sustained and increased cytokine release, along with an increase in Gal-3 expression, is accompanied by significantly increased neurotoxicity in the explanted retina. Further investigations using the current setting may lead to an increased understanding of the mechanisms involved in neuronal loss in retinal neurodegeneration.